Domain Adaptation of Polarity Lexicon combining Term Frequency and Bootstrapping
Authors
Abstract
In this paper we study several approaches to adapting a polarity lexicon to a specific domain: on the one hand, domain adaptation using Term Frequency (TF), and on the other, domain adaptation using pattern matching with a BootStrapping algorithm (BS). Both methods are corpus-based and start from the same polarity lexicon, but the first requires an annotated collection of documents, while the second only needs a corpus in which it looks for linguistic patterns. Both methods outperform the baseline system, which uses the general polarity lexicon iSOL. However, although the TF approach achieves very promising results, the BS strategy does not give as much improvement as we expected. For this reason, we have combined both methods in order to take advantage of the positive aspects of each one. With this combined approach, the results are even better than those of the systems applied individually. Specifically, we have achieved a significant improvement of 11.50% (in terms of accuracy) in the polarity classification of movie reviews with respect to the results achieved with the general-purpose lexicon iSOL.
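The TF idea in the abstract can be illustrated with a minimal sketch. This is not the authors' procedure: the abstract does not give the selection rule, so the `min_count` and `ratio` thresholds, the function names `adapt_lexicon` and `classify`, and the count-based classifier are all assumptions made for illustration. The sketch only shows the general shape of the technique, in which a general lexicon is extended with words that occur much more often in annotated documents of one polarity than in the other.

```python
from collections import Counter


def adapt_lexicon(base_pos, base_neg, labeled_docs, min_count=2, ratio=3.0):
    """Extend a general polarity lexicon with domain words (hypothetical rule).

    labeled_docs is an iterable of (tokens, label) pairs, label in {"pos", "neg"}.
    A word is added to a class when it occurs at least min_count times in that
    class and at least ratio times more often than (count in other class + 1).
    Both thresholds are illustrative assumptions, not values from the paper.
    """
    pos_tf, neg_tf = Counter(), Counter()
    for tokens, label in labeled_docs:
        (pos_tf if label == "pos" else neg_tf).update(tokens)

    pos, neg = set(base_pos), set(base_neg)
    for word in set(pos_tf) | set(neg_tf):
        p, n = pos_tf[word], neg_tf[word]
        if p >= min_count and p >= ratio * (n + 1):
            pos.add(word)
            neg.discard(word)  # domain evidence overrides the general entry
        elif n >= min_count and n >= ratio * (p + 1):
            neg.add(word)
            pos.discard(word)
    return pos, neg


def classify(tokens, pos, neg):
    """Label a document by counting lexicon hits; ties default to "pos"."""
    score = sum(t in pos for t in tokens) - sum(t in neg for t in tokens)
    return "pos" if score >= 0 else "neg"
```

With a handful of toy English reviews in which "gripping" appears only in positive documents and "predictable" only in negative ones, `adapt_lexicon` adds both words to the respective lists, while a word such as "plot" that appears in both classes is left out. A bootstrapping variant would instead grow the lexicon from linguistic patterns in an unlabeled corpus, as the abstract describes.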
Similar resources
Bootstrapping polarity classifiers with rule-based classification
In this article, we examine the effectiveness of bootstrapping supervised machine-learning polarity classifiers with the help of a domain-independent rule-based classifier that relies on a lexical resource, i.e., a polarity lexicon and a set of linguistic rules. The benefit of this method is that though no labeled training data are required, it allows a classifier to capture in-domain knowledge ...
Automatic Extraction of Polar Adjectives for the Creation of Polarity Lexicons
Automatic creation of polarity lexicons is a crucial issue to be solved in order to reduce time and effort in the first steps of Sentiment Analysis. In this paper we present a methodology based on linguistic cues that allows us to automatically discover, extract and label subjective adjectives that should be collected in a domain-based polarity lexicon. For this purpose, we designed a bootstra...
On the Impact of Seed Words on Sentiment Polarity Lexicon Induction
Sentiment polarity lexicons are key resources for sentiment analysis, and researchers have invested a lot of effort in their manual creation. However, there has been a recent shift towards automatically extracted lexicons, which are orders of magnitude larger and perform much better. These lexicons are typically mined using bootstrapping, starting from very few seed words whose polarity is giv...
Sentiment Analysis Based on Expanded Aspect and Polarity-Ambiguous Word Lexicon
This paper focuses on the task of disambiguating polarity-ambiguous words. The task is reduced to sentiment classification of aspects, which we refer to as sentiment expectation, instead of the semantic orientation widely used in previous research. Polarity-ambiguous words are words like "large, small, high, low", which pose a challenging task for sentiment analysis. In order to disambiguate ...
Generating Focused Topic-Specific Sentiment Lexicons
We present a method for automatically generating focused and accurate topic-specific subjectivity lexicons from a general-purpose polarity lexicon that allow users to pinpoint subjective on-topic information in a set of relevant documents. We motivate the need for such lexicons in the field of media analysis, describe a bootstrapping method for generating a topic-specific lexicon from a general...